llr score
When Does Visual Prompting Outperform Linear Probing for Vision-Language Models? A Likelihood Perspective
Tsao, Hsi-Ai, Hsiung, Lei, Chen, Pin-Yu, Ho, Tsung-Yi
Applying transfer learning to a downstream task requires some adaptation of the pre-trained model. For instance, linear probing (LP) trains only a linear layer on top of the model's penultimate-layer features, while full fine-tuning updates all of the model's parameters. In contrast, visual prompting (VP) (Bahng et al., 2022; Chen, 2024), an emerging transfer learning approach, does not require any changes to the pre-trained model; it instead learns a prompt that is applied to the input images. Studies such as CLIP-VP (Bahng et al., 2022) and AutoVP (Tsao et al., 2024) indicate that visual prompting is particularly suitable for out-of-distribution (OOD) datasets. In AutoVP, the authors observed that datasets with lower confidence scores, which indicate that a dataset is more OOD, tend to achieve larger accuracy gains (i.e., the accuracy of VP minus the accuracy of LP).
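To make the quantities in this paragraph concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a dataset's confidence score is the average maximum softmax probability of the pre-trained model's predictions, and that the visual prompt is an additive frame around the image border; both are illustrative assumptions, and the function names `mean_confidence`, `accuracy_gain`, and `add_frame_prompt` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_confidence(batch_logits):
    """Average max-softmax probability over a batch.

    Assumption: this stands in for the 'confidence score' of a dataset;
    lower values are read as the dataset being more OOD.
    """
    return sum(max(softmax(l)) for l in batch_logits) / len(batch_logits)


def accuracy_gain(vp_accuracy, lp_accuracy):
    """Accuracy gain as used in the text: VP accuracy minus LP accuracy."""
    return vp_accuracy - lp_accuracy


def add_frame_prompt(image, prompt, pad):
    """Add prompt values to a border of width `pad`; the interior is untouched.

    Assumption: an additive frame is one common visual-prompt design.
    Only `prompt` would be trained; the pre-trained model stays frozen.
    `image` and `prompt` are 2D lists of the same shape.
    """
    height, width = len(image), len(image[0])
    out = []
    for i in range(height):
        row = []
        for j in range(width):
            on_border = i < pad or i >= height - pad or j < pad or j >= width - pad
            row.append(image[i][j] + prompt[i][j] if on_border else image[i][j])
        out.append(row)
    return out
```

Under these assumptions, a dataset whose `mean_confidence` is low would be the kind for which the text expects a positive `accuracy_gain`. Linear probing would instead leave the input unchanged and train only a linear classifier on the frozen features.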